POEM: 1-Bit Point-Wise Operations Based on E-M for Point Cloud Processing


[Figure 6.4 depicts the 1-bit PointNet pipeline: input (n × 3) → Transform → (n × 3) → (n × 64) → Transform → (n × 64) → (n × 1024) → MaxPooling → (1 × 1024) → output scores. An inset contrasts a real-valued FC layer with a Bi-FC layer: real-valued weights w and activations a_{i-1} are binarized by sign(·) into b_w and b_{a_{i-1}}, combined by bit-wise operations, and rescaled by α; the forward pass uses EM+STE, and backpropagation uses STE.]

FIGURE 6.4
Outline of the 1-bit PointNet obtained by our POEM on the classification task. We keep the first and last fully connected layers real-valued, shown with horizontal stripes. We give the detailed forward and backward propagation process of POEM, where EM denotes the Expectation-Maximization algorithm and STE denotes the Straight-Through Estimator.

where we set $a_1 = -1$ and $a_2 = +1$. Then PRB(·) is equivalent to the sign function, i.e., sign(·).

However, the binarization procedure achieved by PRB(x) is sensitive to disturbance when x follows a Gaussian distribution, as in XNOR-Net. That is, the binarization results are subject to the noise of the raw point cloud data, as shown in Fig. 6.3. To address this issue, we first define an objective as

$$
\arg\min_{x}\ \big\|\mathrm{PRB}(x) - \mathrm{PRB}(x+\gamma)\big\|_2^2,
\tag{6.37}
$$

where γ denotes a disturbance.
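To make this sensitivity concrete, the following minimal sketch (a NumPy illustration of ours, not part of the original text) measures how often sign(·) flips when Gaussian-distributed features receive a small Gaussian disturbance γ:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy feature vector drawn from a Gaussian, as in the XNOR-Net setting.
x = rng.normal(loc=0.0, scale=1.0, size=10_000)

# Small disturbance gamma, e.g., sensor noise on raw point cloud coordinates.
gamma = rng.normal(loc=0.0, scale=0.1, size=x.shape)

# Fraction of entries whose binarized value flips under the disturbance.
flips = np.mean(np.sign(x) != np.sign(x + gamma))
print(f"sign flip rate under disturbance: {flips:.3f}")
```

Entries of x near zero are the culprits: an arbitrarily small γ can cross the threshold of sign(·) and flip the binarized output, which is exactly what objective (6.37) penalizes.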

Another objective is defined to minimize the geometric distance between x and PRB(x) as

$$
\arg\min_{x,\alpha}\ \big\|x - \alpha\,\mathrm{PRB}(x)\big\|_2^2,
\tag{6.38}
$$

where α is an auxiliary scale factor. Recent works on binarized neural networks (BNNs) [199, 159] solve this objective explicitly as

$$
\alpha = \frac{\|x\|_1}{\mathrm{size}(x)},
\tag{6.39}
$$

where size(x) denotes the number of elements in x. However, this closed-form solution neglects that α also influences the output of the 1-bit layer. In contrast, we take this shortcoming into account and modify the learning objective for our POEM.
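For concreteness, the closed-form scale of Eq. (6.39) can be sketched as follows (a NumPy illustration; the function name is ours, not from the text):

```python
import numpy as np

def xnor_binarize(x):
    """Binarize x as in Eqs. (6.38)-(6.39): x is approximated by
    alpha * sign(x), with alpha = ||x||_1 / size(x)."""
    alpha = np.abs(x).sum() / x.size   # Eq. (6.39): L1 norm over element count
    return alpha, np.sign(x)

w = np.array([1.23, -0.12, 0.66, -0.54])
alpha, b = xnor_binarize(w)
print(alpha, b)   # alpha = 0.6375, b = [ 1. -1.  1. -1.]
```

Note that alpha here is fixed once the real-valued weights are given, which is precisely the limitation the text points out: it is chosen only to reconstruct x, not to account for how α affects the layer's output.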

6.3.2 Binarization Framework of POEM

We briefly introduce the framework based on our POEM, as shown in Fig. 6.4. We extend the binarization process from 2D convolutions (XNOR-Net) to fully connected layers (FCs) for feature extraction, termed 1-bit fully connected (Bi-FC) layers, built on extremely efficient bit-wise operations (XNOR and Bit-count) over lightweight binary weights and activations.
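As an illustration of why these bit-wise operations are efficient, the following sketch (pure Python; the identifiers are ours) computes the dot product of two {−1, +1} vectors with XNOR and bit-count, which is the core multiply-accumulate replacement inside a Bi-FC layer:

```python
def pack(v):
    """Pack a {-1, +1} vector into an integer bit mask (+1 -> bit set)."""
    bits = 0
    for i, s in enumerate(v):
        if s > 0:
            bits |= 1 << i
    return bits

def binary_dot(a_bits, w_bits, n):
    """Dot product of two {-1, +1}^n vectors packed as n-bit integers:
    dot = 2 * popcount(XNOR(a, w)) - n (agreements minus disagreements)."""
    mask = (1 << n) - 1
    xnor = ~(a_bits ^ w_bits) & mask       # bit set where the signs agree
    return 2 * bin(xnor).count("1") - n

a = [+1, -1, +1, +1]
w = [+1, +1, -1, +1]
print(binary_dot(pack(a), pack(w), len(a)))   # -> 0, same as sum(ai * wi)
```

A single XNOR plus a popcount thus replaces n real-valued multiply-adds, which is what makes Bi-FC layers attractive for point cloud processing on constrained hardware.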